Expediting RL by Using Graphical Structures (Short Paper)
Authors
Abstract
The goal of reinforcement learning (RL) is to maximize reward (or minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend to be much slower than planning algorithms, which require the model as input. Recent results demonstrate that MDP planning can be greatly expedited by exploiting the graphical structure of the MDP, provided that the MDP has a nontrivial topological structure. We present extensions to two popular RL algorithms, Q-learning and RMax, that learn and exploit the graphical structure of problems to improve overall learning speed. Our experiments show that using the apparent topological structure of an MDP speeds up reinforcement learning, even if the MDP is simply connected.
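To make the planning-side idea concrete, the following is a minimal sketch of value iteration that exploits the graph structure of a known MDP: it splits the state-transition graph into strongly connected components and solves them one at a time, starting from the components that nothing else depends on. This illustrates the general technique only; it is not the Q-learning or RMax extension proposed in the paper. The function names, the dictionary-based MDP encoding, and the default values of `gamma` and `tol` are all assumptions made for this example.

```python
from collections import defaultdict


def transition_graph(transitions):
    """Build a state graph from {(state, action): {next_state: prob}} (assumed encoding)."""
    graph = defaultdict(set)
    for (s, _a), successors in transitions.items():
        graph[s]  # make sure every source state is a node
        for s2, p in successors.items():
            if p > 0:
                graph[s].add(s2)
                graph[s2]  # make sure every successor is a node
    return dict(graph)


def strongly_connected_components(graph):
    """Tarjan's algorithm; components come out with successors before predecessors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(comp)

    for v in list(graph):
        if v not in index:
            visit(v)
    return components


def topological_value_iteration(transitions, rewards, gamma=0.95, tol=1e-8):
    """Run value iteration component by component instead of over all states at once."""
    graph = transition_graph(transitions)
    actions = defaultdict(list)
    for (s, a) in transitions:
        actions[s].append(a)
    V = {s: 0.0 for s in graph}
    for comp in strongly_connected_components(graph):
        while True:
            delta = 0.0
            for s in comp:
                if not actions[s]:
                    continue  # state with no actions: treated as terminal, value 0
                best = max(
                    rewards.get((s, a), 0.0)
                    + gamma * sum(p * V[s2] for s2, p in transitions[(s, a)].items())
                    for a in actions[s]
                )
                delta = max(delta, abs(best - V[s]))
                V[s] = best
            if delta < tol:
                break
    return V


# Hypothetical three-state chain: s0 -> s1 -> s2, reward 1 on the last step.
chain = {("s0", "go"): {"s1": 1.0}, ("s1", "go"): {"s2": 1.0}}
chain_rewards = {("s1", "go"): 1.0}
values = topological_value_iteration(chain, chain_rewards, gamma=0.5)
```

Because each component is solved only after every component it can reach, states in an acyclic part of the graph need a single backup each. The saving disappears when the whole MDP forms one component, which is the case the abstract's final sentence addresses for the learning setting.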
Similar resources
Faster Dynamic Programming for Markov Decision Processes
Markov decision processes (MDPs) are a general framework used in artificial intelligence (AI) to model decision theoretic planning problems. Solving real world MDPs has been a major and challenging research topic in the AI literature, since classical dynamic programming algorithms converge slowly. We discuss two approaches in expediting dynamic programming. The first approach combines heuristic...
Learning Efficient Representations for Reinforcement Learning
Markov decision processes (MDPs) are a well studied framework for solving sequential decision making problems under uncertainty. Exact methods for solving MDPs based on dynamic programming such as policy iteration and value iteration are effective on small problems. In problems with a large discrete state space or with continuous state spaces, a compact representation is essential for providing...
A Graphical Interface Formalism: Specifying Nested Relational Databases
Jan Paredaens, University of Antwerp. An interface is considered as an automaton, for which the dynamics are represented by transitions on a set of states. Part of the actual state is currently represented by the screen. As such, a program in Rl represents the possible dialogue between the user and the system. An overview of Rl, a language for specifying interfaces, is given. Rl is illustrated by ...
New Applications on Linguistic Mathematical Structures and Stability Analysis of Linguistic Fuzzy Models
In this paper, some algebraic structures for linguistic fuzzy models are defined for the first time. By defining a linguistic fuzzy norm, the stability of these systems can be considered. Two methods (norm-based and graphical-based) for stability analysis of linguistic fuzzy systems will be presented. Next, a new simple method for linguistic fuzzy number calculations is defined. At the end tw...
Four Applications of Graphical Models in Machine Learning
Conditional random fields for multi-agent reinforcement learning. Conditional random fields (CRFs, [1]) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained using a set of observation and label pairs. Underlying all CRFs is the assumption that, conditioned on the training data, the labels are independent and identically dis...